"Control-flow integrity" (CFI) is a set of technologies intended to prevent an attacker from redirecting a program's control flow and taking it over. One of the approaches taken by CFI is called "indirect branch tracking" (IBT); its purpose is to prevent an attacker from causing an indirect branch (a function call via a pointer variable, for example) to go to an unintended place. IBT for Intel processors has been under development for some time; after an abrupt turn, support for protecting the kernel with IBT has been merged for the upcoming 5.18 release.
The kernel, like many C programs, makes extensive use of indirect branches. As a simple example, consider system calls; user space provides a number indicating which system call is required, and the kernel responds by looking up the appropriate function from a table (using that number) and calling that function via an indirect branch. Function pointers abound in the kernel; among other things, they are used to implement its vaguely object-oriented programming model.
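The table-driven dispatch described above can be sketched in a few lines of C. This is a minimal illustration, not the kernel's actual system-call code: the handler type, the two handlers, and every `demo_` name are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical handler type; real system calls take varying arguments. */
typedef long (*demo_syscall_fn)(long arg);

static long demo_getpid(long arg) { (void)arg; return 1234; }
static long demo_double(long arg) { return arg * 2; }

/* Table indexed by a (made-up) system-call number. */
static const demo_syscall_fn demo_syscall_table[] = {
    demo_getpid,   /* number 0 */
    demo_double,   /* number 1 */
};

#define DEMO_NR_SYSCALLS \
    (sizeof(demo_syscall_table) / sizeof(demo_syscall_table[0]))

/* Look up the handler by number and call it through the pointer. */
long demo_dispatch(unsigned long nr, long arg)
{
    if (nr >= DEMO_NR_SYSCALLS)
        return -1;                       /* stand-in for -ENOSYS */
    assert(demo_syscall_table[nr] != NULL);
    return demo_syscall_table[nr](arg);  /* the indirect branch */
}
```

The call through `demo_syscall_table[nr]` is the kind of branch whose destination is only known at run time; if the table entry (or any other function pointer) could be overwritten, the call would go wherever the new value points.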
If an attacker is able to somehow corrupt a variable that is used for indirect branches, they may be able to redirect the kernel's execution flow to an arbitrary location. That could result in unintended function calls; on complex processors like x86, it is also possible to get interesting results by jumping into the middle of a multi-byte instruction. Exploit techniques like return-oriented programming and jump-oriented programming depend on this kind of redirection.
IBT is meant as a defense against jump-oriented programming; it works by trying to ensure that the target of every indirect branch is, in fact, intended to be reached that way. There are a number of approaches to IBT, each with its own advantages and disadvantages. For example, the kernel gained support for a compiler-implemented IBT mechanism during the 5.13 development cycle. In this mode, the compiler routes every indirect branch through a "jump table", ensuring that the target is not only meant to be reached by indirect branches, but that the prototype of the called function matches what the caller is expecting. This approach works, at the cost of a fair amount of compile-time and run-time overhead.
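As a rough mental model of the compiler-based scheme, one can imagine every indirect call going through a table whose entries record which prototype they were built for. The sketch below is a toy under that assumption only; it is not how the compiler actually implements the mechanism, and the tags, table, and `cfi_` names are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tags standing in for "function prototype" identities. */
enum proto_tag { PROTO_LONG_LONG, PROTO_VOID_VOID };

typedef long (*long_fn)(long);

struct checked_target {
    enum proto_tag tag;  /* which prototype this entry was built for */
    long_fn fn;          /* only meaningful when tag == PROTO_LONG_LONG */
};

static long add_one(long x) { return x + 1; }

/* Toy "jump table": the only legitimate indirect-call destinations. */
static const struct checked_target cfi_table[] = {
    { PROTO_LONG_LONG, add_one },
    { PROTO_VOID_VOID, NULL },   /* placeholder for a function of another type */
};

#define CFI_TABLE_LEN (sizeof(cfi_table) / sizeof(cfi_table[0]))

/*
 * Call table entry idx as a long(long) function. Returns 0 and stores
 * the result on success; returns -1 if the slot is out of range or was
 * registered with a different prototype.
 */
int cfi_checked_call(size_t idx, long arg, long *result)
{
    assert(result != NULL);
    if (idx >= CFI_TABLE_LEN || cfi_table[idx].tag != PROTO_LONG_LONG)
        return -1;               /* a real scheme would trap here */
    *result = cfi_table[idx].fn(arg);
    return 0;
}
```

The extra lookup and comparison on every indirect call hint at where the run-time overhead mentioned above comes from.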
The Intel IBT approach is rather simpler, but it has the advantage of being supported by the hardware and, as a result, being faster. If IBT is enabled, the CPU will ensure that every indirect branch lands on a special instruction (endbr32 or endbr64), which executes as a no-op; if anything else is found, the processor will raise a control-protection (#CP) exception. Unlike the more complete scheme described above, IBT cannot ensure that the target of an indirect branch matches the caller's expectations, but it can ensure that the target was meant to be reached in this way.
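The hardware check can be modeled in software as "do the bytes at the branch target encode an endbr instruction?" The following sketch simulates that test over a byte buffer standing in for kernel text. It handles only `endbr64` (encoded as `f3 0f 1e fa`), and the function and enum names are made up for the illustration; the real check happens inside the CPU, not in code like this.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Byte encoding of endbr64. */
static const unsigned char ENDBR64[4] = { 0xf3, 0x0f, 0x1e, 0xfa };

enum ibt_result { IBT_OK, IBT_CONTROL_PROTECTION_FAULT };

/*
 * Model of the check: an indirect branch to text[target] is allowed
 * only if the four bytes found there are an endbr64 instruction.
 */
enum ibt_result ibt_check_indirect_branch(const unsigned char *text,
                                          size_t len, size_t target)
{
    assert(text != NULL);
    if (target > len || len - target < sizeof(ENDBR64))
        return IBT_CONTROL_PROTECTION_FAULT;
    if (memcmp(text + target, ENDBR64, sizeof(ENDBR64)) != 0)
        return IBT_CONTROL_PROTECTION_FAULT;
    return IBT_OK;
}
```

Note what the model does not look at: the caller. Any indirect branch may land on any `endbr64`, which is why this scheme is weaker than one that also compares prototypes.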
Turning on a mechanism like this will only work if every possible target of an indirect branch begins with one of the endbr instructions. For the most part, this task can be handled by the compiler; both GCC (as of GCC 9) and Clang (as of version 14) implement the -fcf-protection=branch option and will insert these instructions when it is present. That doesn't help with all of the assembly code in the kernel, though. So the bulk of the work (in terms of changesets) is devoted to adding endbr instructions wherever they seem to be needed.
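One way to see whether a given translation unit was compiled with this option is the `__CET__` predefined macro, which (to the best of this editor's knowledge) GCC and Clang set when `-fcf-protection` is active, with bit 0 indicating branch protection and bit 1 return protection. The helper names below are invented for the example.

```c
/* Decode a __CET__-style value: bit 0 = branch tracking (IBT). */
int cet_has_branch(unsigned int cet_value)
{
    return (cet_value & 1u) != 0;
}

/* Bit 1 = return protection (shadow stack). */
int cet_has_return(unsigned int cet_value)
{
    return (cet_value & 2u) != 0;
}

/* Value the compiler reported for this build, or 0 if the macro is absent. */
unsigned int cet_build_value(void)
{
#ifdef __CET__
    return (unsigned int)__CET__;
#else
    return 0;
#endif
}
```

Building this file with and without `-fcf-protection=branch` and calling `cet_has_branch(cet_build_value())` should show the difference. Some distributions enable the option by default, so a nonzero result without any explicit flag is not surprising.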
One other small complication comes about when the kernel calls into somebody else's code, which may not have been built with IBT in mind. The kernel does not call outside code often, but one big exception is the system's firmware, which must often be invoked to carry out specific functions. To be safe, the kernel makes a point of turning off IBT around calls into firmware. The current implementation also turns off IBT when giving control to user space.
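The pattern around firmware calls is a plain save, disable, call, restore sequence. The sketch below models it with an ordinary boolean in place of the processor's control register; every `demo_` name is hypothetical and none of this is the kernel's actual code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the CPU's "IBT enabled" control bit. */
static bool demo_ibt_enabled = true;

bool demo_ibt_is_enabled(void)
{
    return demo_ibt_enabled;
}

/* Disable tracking and return the previous state so it can be restored. */
static bool demo_ibt_save_and_disable(void)
{
    bool was_enabled = demo_ibt_enabled;
    demo_ibt_enabled = false;
    return was_enabled;
}

static void demo_ibt_restore(bool was_enabled)
{
    demo_ibt_enabled = was_enabled;
}

typedef long (*demo_firmware_fn)(long);

/* Pretend firmware routine: returns -1 if it is run with tracking still on. */
long demo_firmware_service(long arg)
{
    return demo_ibt_is_enabled() ? -1 : arg + 1;
}

/* Call code that may lack endbr instructions with tracking switched off. */
long demo_call_firmware(demo_firmware_fn fn, long arg)
{
    assert(fn != NULL);
    bool saved = demo_ibt_save_and_disable();
    long ret = fn(arg);
    demo_ibt_restore(saved);
    return ret;
}
```

Restoring the saved value, rather than unconditionally re-enabling, keeps the wrapper correct on systems where tracking was never turned on in the first place.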
The need to add endbr instructions to all indirect jump targets sets a potential trap for the future; developers may add assembly functions and forget that instruction. If they do their testing without IBT enabled, the omission will not be noticed, and it may not pop up until some extremely inconvenient time after the faulty work has been merged. To prevent this eventuality, the kernel's objtool utility has been enhanced to check all indirect branches and ensure that all targets are appropriately annotated.
With that checking in place, though, there's another step that can be taken: objtool can also make a list of all functions containing endbr instructions that can never be called via an indirect branch. Those functions do not need that annotation, and the kernel would be a little more secure without them. So the kernel build process takes that list from objtool and "seals" those functions by overwriting the endbr with a nop4 instruction. That reduces the number of targets an attacker can still choose from when IBT is enabled.
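The sealing step amounts to a byte-level patch: where a function starts with `endbr64` (`f3 0f 1e fa`) but is never the target of an indirect branch, those four bytes are replaced with a four-byte no-op (`0f 1f 40 00`). The sketch below applies that idea to a buffer standing in for kernel text; the `fn_entry` record and the function name are assumptions made for the example and do not reflect objtool's real data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

static const unsigned char SEAL_ENDBR64[4] = { 0xf3, 0x0f, 0x1e, 0xfa }; /* endbr64 */
static const unsigned char SEAL_NOP4[4]    = { 0x0f, 0x1f, 0x40, 0x00 }; /* 4-byte nop */

/* Hypothetical record for one function, as an analysis tool might report it. */
struct fn_entry {
    size_t offset;           /* start of the function within the text image */
    bool indirectly_called;  /* true if some indirect branch may target it */
};

/*
 * Overwrite endbr64 with nop4 at every function that is never reached
 * indirectly. Returns the number of functions sealed.
 */
size_t seal_unused_endbr(unsigned char *text, size_t len,
                         const struct fn_entry *fns, size_t nfns)
{
    size_t sealed = 0;

    assert(text != NULL && fns != NULL);
    for (size_t i = 0; i < nfns; i++) {
        size_t off = fns[i].offset;

        if (fns[i].indirectly_called)
            continue;                   /* still needs its landing pad */
        if (off > len || len - off < sizeof(SEAL_ENDBR64))
            continue;                   /* malformed entry, skip it */
        if (memcmp(text + off, SEAL_ENDBR64, sizeof(SEAL_ENDBR64)) != 0)
            continue;                   /* no endbr64 here, nothing to seal */
        memcpy(text + off, SEAL_NOP4, sizeof(SEAL_NOP4));
        sealed++;
    }
    return sealed;
}
```

Because the replacement is the same length as the original instruction, nothing else in the function moves; a sealed function still works when called directly, but an indirect branch to it would now fail the landing-pad check.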
As Peter Zijlstra pointed out, there is another, perhaps surprising advantage to removing the unneeded endbr instructions. The kernel limits the functions that are available to loadable modules, and proprietary modules are limited even further. It is a common technique for proprietary modules to look up the non-exported functions they need in the kernel's symbol table, then call them via an indirect branch, thus bypassing the kernel's limitations. But, with IBT enabled, any function lacking an endbr instruction will no longer be callable in this way.
The effort to get Intel IBT support into the Linux kernel has been ongoing for some time; the first patches implementing support (for user-space code rather than for the kernel) were posted by Yu-cheng Yu in 2018. This work then seemingly became one of those flying-Dutchman patches that continually cross the mailing lists without ever managing to land in the mainline; version 30 was posted in August 2021 and seemed no closer to merging. A similar fate befell the user-space shadow-stack patches, which were recently taken over by Rick Edgecombe after many previous revisions.
Late last year, Peter Zijlstra decided to create a separate Intel IBT implementation to protect the kernel itself; the first version was posted last November after Zijlstra evidently "hacked this up on Friday night / Saturday morning". The work evolved quickly, and the fourth revision, posted in early March, is the code that was merged for 5.18.
That is where things stand today. IBT is supported, for kernel code only, in Intel processors starting with the Tiger Lake generation, which hit the market in late 2020. It is not a perfect tool, but it will raise the bar for attackers on systems where it is present and enabled. Meanwhile, it is not clear when (or whether) user-space support will find its way into the kernel; many of the 30 revisions posted so far have received no comments at all.
Posted Mar 31, 2022 16:35 UTC (Thu) by Hello71 (subscriber, #103412) [Link]
Why can't the crap module simply turn off IBT before calling the function (and hopefully turn it back on afterwards)?
Posted Mar 31, 2022 18:19 UTC (Thu) by JoeBuck (subscriber, #2330) [Link]
Posted Mar 31, 2022 18:28 UTC (Thu) by mb (subscriber, #50428) [Link]
func = kallsyms_lookup_name("unexported_function");
(*func)(args);
would also be a DMCA violation already.
The hypothesis was: IBT could be used to technically prevent the illegal kallsyms_lookup_name.
But can it prevent it?
Posted Mar 31, 2022 20:25 UTC (Thu) by donald.buczek (subscriber, #112892) [Link]
There are legitimate reasons to call a non-exported kernel function from a module. E.g. when you create a module to install a ftrace-based wrapper around a kernel function with a security problem because you can't immediately reboot into a fixed kernel for one reason or another and you are not prepared for live patching.
We recently had to do that [1] and needed to work around a missing kallsyms_lookup_name, which we did with register_kprobe.
The attempt of the kernel to restrict modules reminds me of DRM. You don't really succeed, bad guys work around anyway, but you make life harder for legitimate users.
[1]: https://github.molgen.mpg.de/mariux64/fix-lpp/blob/main/f...
Posted Apr 2, 2022 1:45 UTC (Sat) by developer122 (subscriber, #152928) [Link]
Half-object because there would be *some* maintenance burden to making changes for out of tree code. The kernel already has to periodically sync against other upstream projects whose code it uses.
Posted Apr 6, 2022 16:34 UTC (Wed) by immibis (subscriber, #105511) [Link]
Posted Apr 6, 2022 16:32 UTC (Wed) by immibis (subscriber, #105511) [Link]
I believe common opinion is settled: exported functions form an arms-length public API, but if you go digging in deeply, you're a derivative work. Of course, that has not been tested in court.
Posted Apr 6, 2022 18:04 UTC (Wed) by nybble41 (subscriber, #55106) [Link]
From 17 U.S.C. § 101 <http://www.copyright.gov/title17/92chap1.html#101>:
> A “derivative work” is a work based upon one or more preexisting works, such as a translation, musical arrangement, dramatization, fictionalization, motion picture version, sound recording, art reproduction, abridgment, condensation, or any other form in which a work may be recast, transformed, or adapted. A work consisting of editorial revisions, annotations, elaborations, or other modifications, which, as a whole, represent an original work of authorship, is a “derivative work”.
What all the examples have in common is that they actually incorporate the original work—not just a reference but the actual creative expression in some form or another—as part of the derivative work. Source code or dynamically linked object code which merely *references* some external API is not a "recast, transformed, or adapted" version of that other software in the form in which it is distributed and does not resemble any of the examples given here.
I am not your lawyer, this is not legal advice, yada yada, but IMHO the idea that your original loadable kernel module should be treated as a derivative work of the kernel just because it *references* certain kernel APIs by name is completely without basis under any reasonable interpretation of US copyright law.
Posted Apr 6, 2022 23:21 UTC (Wed) by immibis (subscriber, #105511) [Link]
Posted Apr 7, 2022 0:28 UTC (Thu) by sfeam (subscriber, #2841) [Link]
"and the GPL is pointless" - Maybe you meant the LGPL? In which case I agree with you; the LGPL is in essence identical to the GPL with the addition of a promise not to get into the above endless argument.
Posted Apr 7, 2022 15:22 UTC (Thu) by Wol (subscriber, #4433) [Link]
Cue endless bikeshedding about the meaning of "non-trivial" :-)
Cheers,
Wol
Posted Apr 7, 2022 16:04 UTC (Thu) by nybble41 (subscriber, #55106) [Link]
a) Non-trivial code shouldn't be inlined. b) This has no bearing on the question of "GPL-only" kernel symbols as inline code isn't accessed through the symbol table. c) Inline code in header files would have no effect on modules distributed in source form (including obfuscated source), as the source distribution doesn't include those headers. d) If the inclusion of certain nontrivial inline code is required for the interoperability of a binary module distribution, this could reasonably be argued to be fair use.
Posted Apr 11, 2022 11:09 UTC (Mon) by LtWorf (subscriber, #124958) [Link]
Like there is a statically linked file on disk, there is a statically linked file in RAM.
Posted Apr 11, 2022 14:04 UTC (Mon) by anselm (subscriber, #2796) [Link]
Dynamic linking is static after the linking has happened.
Yes, but the result isn't distributed to others, so copyright doesn't kick in. FWIW, the GPL explicitly says that, on your own machine, you get to do whatever you want with the code, which would presumably include dynamically linking stuff to it.
IOW, I'm perfectly free to take my own copy of Harry Potter and replace all occurrences of “Albus Dumbledore” with “Alfred E. Neuman”. As long as I don't distribute the result to anybody else, the fact that this has created a “derived work” of the original book is of concern to nobody except myself.
Posted Apr 12, 2022 4:20 UTC (Tue) by calumapplepie (subscriber, #143655) [Link]
Posted Apr 8, 2022 14:10 UTC (Fri) by kronat (subscriber, #117266) [Link]
s/separate files/in the cloud, and problem solved, without caring about linking.
Posted Apr 8, 2022 14:29 UTC (Fri) by immibis (subscriber, #105511) [Link]
Posted Apr 11, 2022 15:14 UTC (Mon) by flussence (subscriber, #85566) [Link]
I think Microsoft would vastly prefer the opposite to be true, given that they could've sued Wine users out of existence for using their ABIs if it were.
Posted Mar 31, 2022 19:02 UTC (Thu) by sdalley (subscriber, #18550) [Link]
Posted Mar 31, 2022 19:49 UTC (Thu) by Sesse (subscriber, #53779) [Link]
Posted Apr 1, 2022 4:02 UTC (Fri) by milesrout (subscriber, #126894) [Link]
Posted Apr 1, 2022 4:56 UTC (Fri) by donald.buczek (subscriber, #112892) [Link]
[1]: https://wonky.computer/post/make-linux-fast-again/
[2]: https://www.linux-community.de/ausgaben/linuxuser/2019/08... (German)
Posted Apr 1, 2022 7:35 UTC (Fri) by Villemoes (subscriber, #91911) [Link]
That mechanism also proactively prevents attackers from gaining control over the machine by inducing NULL pointer derefs (and who knows what other malfunctions) in perfectly fine C code. If you consider enabling that, make sure none of the code included in your build relies on function pointer (in)equality testing.
Posted Apr 1, 2022 9:46 UTC (Fri) by andy_shev (subscriber, #75870) [Link]
Posted Apr 1, 2022 10:28 UTC (Fri) by jamescrake-merani (subscriber, #157540) [Link]
Copyright © 2022, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds